1 00:00:00,000 --> 00:00:13,920 This is Hacker Public Radio episode 3911 for Monday, 31 July 2023. 2 00:00:13,920 --> 00:00:18,680 Today's show is entitled, An Overview of the ack Command. 3 00:00:18,680 --> 00:00:21,720 It is part of the series Lightweight Apps. 4 00:00:21,720 --> 00:00:28,240 It is the 150th show of Dave Morris, and is about 21 minutes long. 5 00:00:28,240 --> 00:00:31,000 It carries an explicit flag. 6 00:00:31,000 --> 00:00:36,440 The summary is: a Perl-based grep-like tool that can search by file type. 7 00:00:40,440 --> 00:00:43,640 Hello everyone, welcome to Hacker Public Radio. 8 00:00:43,640 --> 00:00:52,080 My name is Dave Morris, and I'm talking today about a command, which I have put into the 9 00:00:52,080 --> 00:00:56,560 category or the series of Lightweight Apps. 10 00:00:56,560 --> 00:01:03,680 So what this is, is a Perl-based tool that behaves like grep. 11 00:01:03,680 --> 00:01:12,240 It actually uses the name Beyond Grep on the website, which I refer to in the notes. 12 00:01:12,240 --> 00:01:19,280 So this tool is called ack, not quite sure why, but it's a good thing. 13 00:01:19,280 --> 00:01:23,480 And it's got three main features that I use. 14 00:01:23,800 --> 00:01:25,400 I don't use it a huge lot. 15 00:01:25,400 --> 00:01:30,480 It's not a thing I use for every search of a file, but it's great for certain things. 16 00:01:30,480 --> 00:01:37,400 First of all, it can restrict the searches to files of a particular type. 17 00:01:37,400 --> 00:01:42,600 So there's a way of classifying files in terms of type, which I'll talk about in a minute, 18 00:01:42,600 --> 00:01:44,320 and it will search only those. 19 00:01:44,320 --> 00:01:50,160 Then there are the regular expressions that it uses. I mean, grep will handle plain text stuff 20 00:01:50,240 --> 00:01:54,800 and regular expressions of various sorts, including a Perl one. 21 00:01:54,800 --> 00:01:58,200 But this one uses only Perl.
22 00:01:58,200 --> 00:02:03,600 I think it's actually possible to simplify it, but the default is Perl anyway. 23 00:02:03,600 --> 00:02:10,040 And Perl, of course, has one of the most powerful, feature-rich types of regular expressions. 24 00:02:10,040 --> 00:02:12,640 So to me, that's fantastic. 25 00:02:12,640 --> 00:02:17,920 And it's got features like, you can limit the search area within a file if you want to. 26 00:02:17,920 --> 00:02:19,280 I'm not going to go into that. 27 00:02:19,280 --> 00:02:24,960 There's a lot to say here if I was to dig deeply into all aspects of this command, 28 00:02:24,960 --> 00:02:27,120 and I'm not going to do it because I don't want to 29 00:02:27,120 --> 00:02:31,760 bore you to death. It's almost a series in itself if I were to go into that level of detail. 30 00:02:31,760 --> 00:02:36,240 It's fantastic, I believe, but it's a little complex to use, 31 00:02:36,240 --> 00:02:42,240 and I use it mainly in special cases where I need the features I've just mentioned. 32 00:02:42,240 --> 00:02:47,840 So I'll just give you a flavor of what it can do and leave you to research it more if it sounds interesting. 33 00:02:47,920 --> 00:02:51,360 So you can install it in the usual sorts of ways. 34 00:02:51,360 --> 00:02:54,800 I actually installed it as a package. 35 00:02:54,800 --> 00:02:58,800 I use Debian, so I used apt to do it, apt install. 36 00:02:58,800 --> 00:03:02,000 So sudo apt install ack, that's fine. 37 00:03:02,000 --> 00:03:09,760 Only, as I was preparing the show, I noticed that I was using version 3.6.0. 38 00:03:09,760 --> 00:03:15,840 And that's a little bit behind, and there's a new version, 3.7.0, 39 00:03:15,840 --> 00:03:19,600 which you can get details about from the website.
40 00:03:19,600 --> 00:03:25,520 It suggests that you might want to install it as a Perl module using CPAN, 41 00:03:25,520 --> 00:03:29,760 but if you're not a Perl user, these are things you might not want to take on board. 42 00:03:29,760 --> 00:03:33,520 Let's just talk briefly about Perl regular expressions. 43 00:03:33,520 --> 00:03:39,120 As I said, they're very sophisticated and have grown tremendously over the years. 44 00:03:39,120 --> 00:03:44,880 I think Perl was a ground breaker in terms of regular expressions, 45 00:03:44,880 --> 00:03:50,480 because a lot of other regular expression engines follow Perl's lead. 46 00:03:50,480 --> 00:03:55,040 I've certainly used Perl for quite a long time now, 47 00:03:55,040 --> 00:03:58,080 and the power of it is quite spectacular. 48 00:03:58,080 --> 00:04:04,720 What happened during the last number of years was that there came to be a thing called 49 00:04:04,720 --> 00:04:09,600 the Perl Compatible Regular Expressions library. 50 00:04:09,600 --> 00:04:14,400 This was put together by a guy called Philip Hazel from Cambridge University. 51 00:04:14,400 --> 00:04:18,640 He did this in 1997. I was particularly interested in this, 52 00:04:18,640 --> 00:04:26,880 because I was working with, I ran Exim, the mail transfer agent, 53 00:04:26,880 --> 00:04:32,960 and PCRE, Perl Compatible Regular Expressions, were implemented within it, 54 00:04:32,960 --> 00:04:36,320 and Philip Hazel was the author of Exim. 55 00:04:36,320 --> 00:04:42,560 So he did this, and PCRE is available in quite a lot of other areas. 56 00:04:42,560 --> 00:04:46,480 I think at one point Python used it, I'm not sure it does now, but anyway. 57 00:04:46,480 --> 00:04:51,680 But since the original PCRE, there's come a version called PCRE2, 58 00:04:51,680 --> 00:04:57,120 so he's still developing it, and it is very widely used.
59 00:04:57,120 --> 00:05:05,200 The ack documentation refers you to various Perl manuals if you want to use 60 00:05:05,200 --> 00:05:11,200 the regular expression syntax in detail, although you can use it in a simplistic way, 61 00:05:11,200 --> 00:05:15,440 which is what I do mostly. I don't use the full power of it, but it is quite nice 62 00:05:15,440 --> 00:05:18,880 to be able to get into some of the Perl stuff. 63 00:05:18,880 --> 00:05:22,560 But there's plenty of documentation if you're interested. 64 00:05:22,560 --> 00:05:28,320 In fact, GNU grep, which would be the one that most people will have on a Linux system, 65 00:05:28,320 --> 00:05:33,520 can use Perl-compatible regular expressions when doing its matching, 66 00:05:33,520 --> 00:05:37,040 but I can't remember what the option is. I think it's a capital P, 67 00:05:37,200 --> 00:05:43,440 not sure, but if you look at the documentation for GNU grep, it says this is experimental. 68 00:05:43,440 --> 00:05:46,400 I've not really got into using that in a big way. 69 00:05:46,400 --> 00:05:50,720 So let's talk about the file types, as I mentioned before. 70 00:05:50,720 --> 00:05:55,680 So this ack command has got rules for recognizing file types, 71 00:05:55,680 --> 00:05:58,880 and it does this by looking at the name extensions, 72 00:05:58,880 --> 00:06:02,880 fairly obviously, so .html or .py. 73 00:06:02,880 --> 00:06:08,960 It can also look at the file's contents, to see the sort of first line, 74 00:06:08,960 --> 00:06:14,880 which has got, what do they call that? Some sort of magic thingy. 75 00:06:14,880 --> 00:06:19,040 That's how you determine what a file is. I've completely forgotten the terminology. 76 00:06:19,040 --> 00:06:24,800 You can find out what ack knows about in terms of types by giving it the option 77 00:06:25,440 --> 00:06:32,640 --help-types, or you can use ack space --dump.
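The two ways of listing file types mentioned here could be tried out roughly as follows. This is only a sketch, assuming an ack 3.x binary named ack is on the PATH; the variable names are made up for the illustration, and the guard is there so the script still runs where ack is missing.

```shell
# List the file types ack knows about and the settings it has loaded.
# Guarded so the script still runs on a machine without ack.
if command -v ack >/dev/null 2>&1; then
    types=$(ack --help-types)   # the known types and how each is detected
    settings=$(ack --dump)      # the options in effect and where they came from
    printf '%s\n' "$types" | head -n 20
else
    types="ack not installed"
    settings="ack not installed"
    echo "$types"
fi
```

The --dump output is the one to look at after changing a configuration file, since it shows which file each setting was read from.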
78 00:06:32,640 --> 00:06:38,000 Some of the examples are cc, as a type, and those are C files, 79 00:06:38,000 --> 00:06:43,280 haskell for Haskell files, lua for Lua files, python for Python files, 80 00:06:43,280 --> 00:06:46,080 shell for Bash and other shell command files. 81 00:06:46,080 --> 00:06:49,760 And these names can be used with the options 82 00:06:49,840 --> 00:06:59,600 lowercase -t, or --type equals the type name, 83 00:06:59,600 --> 00:07:06,240 and also by preceding the type name with two dashes, if you wish. 84 00:07:06,240 --> 00:07:11,680 I think that particular way of doing it might not be available in the future, 85 00:07:11,680 --> 00:07:15,120 because it says it's deprecated somewhere. That's what I use. 86 00:07:15,120 --> 00:07:26,080 You can also say files not of this type by using --type equals no 87 00:07:26,880 --> 00:07:32,000 followed by the type string. Anyway, not that I use that much myself. I don't usually want to look 88 00:07:32,000 --> 00:07:35,920 at files which are not shell scripts, because that'll be everything else and I don't want it. 89 00:07:35,920 --> 00:07:42,080 But it's good that you have it. So we've got a little example here: to check files in the current 90 00:07:42,080 --> 00:07:49,920 directory which are shell scripts, you might want to do this with the command 91 00:07:49,920 --> 00:07:56,880 ack space --shell, space, and then the search string, which in my case is declare. 92 00:07:57,600 --> 00:08:04,960 And I'm actually in the directory where I prepare my HPR shows, and it finds 93 00:08:05,760 --> 00:08:12,960 one entitled Bash snippet, using coproc with SQLite. It finds a shell script there 94 00:08:12,960 --> 00:08:20,560 and shows me on line 11 there is the occurrence of the word declare. And it's quite nice that 95 00:08:20,560 --> 00:08:25,920 it by default gives you the line number.
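A small way to try the type-restricted search just described, as a sketch only: it assumes ack 3.x is installed, and the directory and file names (snippet.sh, notes.txt) are invented for the illustration rather than taken from the show.

```shell
# Throwaway directory with one shell script and one plain text file
# (both file names are hypothetical).
demo=$(mktemp -d)
printf '#!/bin/bash\ndeclare -a items\n' > "$demo/snippet.sh"
printf 'we declare nothing here\n' > "$demo/notes.txt"

if command -v ack >/dev/null 2>&1; then
    # --shell limits the search to files ack classifies as shell scripts,
    # so only snippet.sh should be reported, along with the line number.
    shell_hits=$(cd "$demo" && ack --shell 'declare' .)
    # --type=noshell is the inverse: search everything except shell scripts.
    other_hits=$(cd "$demo" && ack --type=noshell 'declare' .)
else
    shell_hits="ack not installed"
    other_hits="ack not installed"
fi
printf '%s\n---\n%s\n' "$shell_hits" "$other_hits"
```

The directory is given explicitly as "." so that ack searches files even when the script is run with its standard input attached to a pipe.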
It also does coloring of these things when you search, 96 00:08:25,920 --> 00:08:32,080 all of which you can turn off and mess around with to enormous degrees. You can add your own 97 00:08:32,080 --> 00:08:42,000 file types to ack, and there's a configuration file called .ackrc, which you can add more types 98 00:08:42,000 --> 00:08:51,920 to, and I'll talk about that next. So that configuration file is .ackrc, and it contains, as the manual says, 99 00:08:51,920 --> 00:08:57,520 command line options that are prepended to the command line before processing. So it's a useful 100 00:08:57,520 --> 00:09:03,520 way to add new types or modify existing ones. There are a number of places it can be placed, 101 00:09:03,520 --> 00:09:08,960 and the documentation will tell you that. But I put mine in my home directory where I keep all my 102 00:09:08,960 --> 00:09:15,680 other configuration files. It can be in other, more regular places, in the config directory. 103 00:09:16,400 --> 00:09:26,080 You can actually create a new .ackrc with the option --create-ackrc. 104 00:09:26,080 --> 00:09:36,160 So it will write out an example series of settings on standard out and you can just redirect it to 105 00:09:36,160 --> 00:09:42,480 the file. That's all the defaults that are built into the script, but it means that you've 106 00:09:42,560 --> 00:09:48,160 got them somewhere where you can change them if you wish. Now I have a lot of Markdown files 107 00:09:48,160 --> 00:09:54,320 in my directory where I do all my HPR talks. I write everything in Markdown. And for some 108 00:09:54,320 --> 00:10:01,840 reason, I'm not sure why I did this originally, I gave them the extension .mkd. I must have seen 109 00:10:01,840 --> 00:10:07,040 somebody else do that or it just seemed like the right thing and I wasn't sure what was better. But 110 00:10:07,040 --> 00:10:17,520 ack recognises .md and .markdown as signalling a Markdown file. So I wanted to add .mkd to the list.
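Generating a starter configuration with --create-ackrc, as mentioned a little earlier, might look something like this. It is a sketch that assumes ack 3.x is installed; it writes to a temporary file (the variable name sample is made up) rather than to ~/.ackrc, so nothing existing gets overwritten.

```shell
# Write ack's built-in defaults to a scratch file instead of ~/.ackrc.
sample=$(mktemp)
if command -v ack >/dev/null 2>&1; then
    ack --create-ackrc > "$sample"   # settings go to standard output
    head -n 10 "$sample"             # peek at the start of the generated file
else
    echo "ack not installed" > "$sample"
    cat "$sample"
fi
```

Once the contents look sensible, the file could be copied to ~/.ackrc and edited from there.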
111 00:10:17,520 --> 00:10:24,080 And it was pretty simple to do. There are two commands you can use within your 112 00:10:24,160 --> 00:10:32,960 .ackrc. There's one which is --type-add equals markdown colon ext 113 00:10:32,960 --> 00:10:43,120 colon mkd. So it appends the particular extension to the existing list. Or you can use 114 00:10:43,120 --> 00:10:52,480 --type-set equals markdown colon ext colon md comma mkd comma 115 00:10:52,480 --> 00:10:58,960 markdown. What that does is to replace the existing settings. That's why it uses set as 116 00:10:58,960 --> 00:11:06,240 opposed to add after type. And you just give it the list, the currently existing list plus 117 00:11:06,880 --> 00:11:13,760 whatever else you've added to it. You can put comments in the file too. So if you then dump the 118 00:11:13,760 --> 00:11:24,160 settings, having done this, you can see that markdown is listed with .md, .markdown and .mkd 119 00:11:24,160 --> 00:11:33,040 as detectable extensions. So if I do a search of Markdown files in the directory I'm in, where my 120 00:11:33,120 --> 00:11:40,560 various HPR shows live, I keep them all around forever: ack space --markdown 121 00:11:40,560 --> 00:11:49,520 space then in quotes inner, well, in lower case, space ear. I get back one match, one file match, and then I 122 00:11:49,520 --> 00:11:56,400 get a bunch of lines that contain the string inner ear. Now there are a lot of options to ack. 123 00:11:56,400 --> 00:12:04,720 The general usage pattern for using the command is that you type ack followed by a list of options, 124 00:12:04,720 --> 00:12:11,040 followed by a pattern, which is the thing you're matching, usually in files, but I'll come on to that 125 00:12:11,040 --> 00:12:17,840 more in a moment, and followed by an optional list of files or directories.
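The .ackrc change for the .mkd extension, followed by a Markdown-only search in the options, pattern, directory order, could be sketched like this. It assumes ack 3.x and relies on the ACKRC environment variable to point ack at a private configuration file, so the real ~/.ackrc is left alone; the file names and the sample text are invented for the illustration.

```shell
# Scratch directory holding one Markdown file with the unusual .mkd extension.
demo=$(mktemp -d)
printf '# Notes\nHacking my inner ear\n' > "$demo/notes.mkd"

# A private ackrc for this demo (hypothetical location).
rc="$demo/ackrc"
cat > "$rc" <<'EOF'
# Teach the markdown type about the .mkd extension
--type-add=markdown:ext:mkd
EOF

if command -v ack >/dev/null 2>&1; then
    # Usage pattern: ack [options] PATTERN [files or directories]
    md_hits=$(ACKRC="$rc" ack --markdown -i 'inner ear' "$demo")
else
    md_hits="ack not installed"
fi
echo "$md_hits"
```

Using --type-set=markdown:ext:md,mkd,markdown in the same file would replace the built-in list rather than extend it, so the existing extensions have to be repeated.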
If you don't give 126 00:12:18,800 --> 00:12:24,160 a list of files or directories then it will look at the current directory and will recurse 127 00:12:24,160 --> 00:12:32,560 down into subdirectories. And the pattern that I mentioned is the PCRE search string, which is usually 128 00:12:32,560 --> 00:12:38,560 enclosed in single quotes so it doesn't get interpreted by the shell. There are some 129 00:12:38,560 --> 00:12:46,320 cases where you don't use a pattern, but we'll look at that briefly in a moment. You can look at the full 130 00:12:46,320 --> 00:12:55,280 documentation with the usual man ack command, and the alternative to doing that is to use 131 00:12:55,280 --> 00:13:02,000 ack itself to report its man page, which is ack space --man. There is also an option 132 00:13:02,000 --> 00:13:07,120 --help which gives a summary of all the available options, which I actually find 133 00:13:07,120 --> 00:13:13,840 more useful because it's usually options I'm trying to remember and it's easier to scan through 134 00:13:13,840 --> 00:13:19,360 than the full documentation. I've got a few options to refer to here. There's quite a lot 135 00:13:19,360 --> 00:13:25,280 that are specific to ack and, since it's meant to be a grep stand-in, there are some 136 00:13:25,280 --> 00:13:30,160 which are common to grep, but I won't look at too many. Well, the first one is one that you do find 137 00:13:30,160 --> 00:13:37,760 in grep, which is -i, and that makes the pattern matching case insensitive. It's possible to 138 00:13:37,760 --> 00:13:46,000 do that within the Perl expression. You can say of the things to be matched, I don't care 139 00:13:46,000 --> 00:13:54,000 about the case when matching, but it's a lot easier to use it as an option I find. Then we get to 140 00:13:54,880 --> 00:14:04,160 -f, which is about searching for files by name, so it only prints the files that would be 141 00:14:04,160 --> 00:14:10,480 searched.
It comes back with the list of file names. It doesn't do any searching but it's useful 142 00:14:10,480 --> 00:14:18,960 for finding things within a directory which are a particular type or match a pattern or whatever. 143 00:14:18,960 --> 00:14:27,440 -g is the same as -f but in this case you use a pattern and you look for files which match 144 00:14:27,600 --> 00:14:34,560 that pattern, whose names match the pattern, not the contents. But that overlaps what you get 145 00:14:34,560 --> 00:14:42,720 with the type business so it's useful in some cases but can be a little bit misleading. Then we get 146 00:14:42,720 --> 00:14:50,560 -l which reports the file names which contain matches for a given pattern. So it's not actually 147 00:14:50,640 --> 00:14:56,320 showing you the matches but it's showing you the file which would match, and the dash capital 148 00:14:56,320 --> 00:15:03,520 L reports file names which do not match the pattern. Then you've got -c, which you'll 149 00:15:03,520 --> 00:15:11,680 get in grep, which reports file names and the numbers of matches when you use it. So it actually 150 00:15:11,680 --> 00:15:18,560 reports all files that match whatever it is you're matching against in terms of file types or names. 151 00:15:18,560 --> 00:15:26,480 It reports them all and it gives you a count of zero if there are no matches, which I find a bit 152 00:15:27,840 --> 00:15:34,720 of a pain, not that useful, but if you use it with -l then you only see the names of 153 00:15:34,720 --> 00:15:42,560 files that have matches and a count of the matches.
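The file-listing and counting options covered in this stretch (-f, -g, -l, -L, -c and -c with -l) can be compared side by side with something like the following. It is a sketch assuming ack 3.x; the three small Markdown files are invented purely so there is something to list and count.

```shell
# Three tiny Markdown files: only one.md contains the word being searched for.
demo=$(mktemp -d)
printf 'an ear\nanother ear\n' > "$demo/one.md"
printf 'a nose\n'              > "$demo/two.md"
printf 'nothing relevant\n'    > "$demo/three.md"

if command -v ack >/dev/null 2>&1; then
    searched=$(ack -f --markdown "$demo")   # files that would be searched
    by_name=$(ack -g 'two\.md$' "$demo")    # files whose NAME matches the pattern
    with=$(ack -l 'ear' "$demo")            # files containing a match
    without=$(ack -L 'ear' "$demo")         # files with no match
    counts=$(ack -c 'ear' "$demo")          # every file with its count, zeros included
    counts_l=$(ack -lc 'ear' "$demo")       # only matching files, with counts
else
    searched="ack not installed"
    by_name=$searched with=$searched without=$searched
    counts=$searched counts_l=$searched
fi
printf '%s\n' "-f:" "$searched" "-g:" "$by_name" "-l:" "$with" \
              "-L:" "$without" "-c:" "$counts" "-lc:" "$counts_l"
```

Comparing the -c and -lc sections of the output shows the zero-count lines that the combination with -l suppresses.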
So I've just used it yesterday looking through 154 00:15:42,560 --> 00:15:47,840 a bunch of files to see if any of them had a particular string in them because I needed to 155 00:15:47,840 --> 00:15:56,560 edit it because it was a grammatical error, and so using the -c and -l was 156 00:15:56,560 --> 00:16:03,840 a great way to do it. And then we have -w, which forces the search pattern to match only 157 00:16:03,840 --> 00:16:08,720 whole words. So a lot of times there isn't a way of doing that in grep that I know of. 158 00:16:08,720 --> 00:16:14,000 I think I'm wrong actually. I think some of the regular expression capabilities of 159 00:16:14,080 --> 00:16:20,880 grep will allow you to do that anyway, but it's sometimes useful to be able to say, look, that 160 00:16:20,880 --> 00:16:26,400 sequence of characters I've just given you that I want you to look for is a word or words, 161 00:16:26,960 --> 00:16:34,880 as opposed to just being an ABC anywhere in any text. So there's a lot. There's actually a lot 162 00:16:34,880 --> 00:16:41,440 of power in this, possibly too much. I don't use it all by any means. I've got a few examples. 163 00:16:41,920 --> 00:16:49,200 The first example is looking for all Markdown files in a directory. So the first option was to use the 164 00:16:49,920 --> 00:17:00,400 -f option. So I typed ack space --markdown space -f space and then the 165 00:17:00,400 --> 00:17:07,680 name of a subdirectory, Nitecore Tube torch. It's a show I did some time ago, and it comes back with the list 166 00:17:07,680 --> 00:17:16,400 of names which are the Markdown files within that directory, or mkd files. 167 00:17:17,280 --> 00:17:22,640 Now there are many other ways you could do that. You could use the find command to find them. 168 00:17:22,640 --> 00:17:28,800 That would have been what I would have used in the past but I find ack does just as nice a 169 00:17:28,800 --> 00:17:37,680 job.
But I've given an alternative here using the -g option, so ack space -g, and then we're using 170 00:17:37,680 --> 00:17:46,320 a pattern. The pattern is open quote, backslash dot, which matches an actual dot, mkd, dollar, 171 00:17:46,320 --> 00:17:52,160 close quote, so that's saying any file name you get back ends with .mkd, and then the 172 00:17:52,160 --> 00:17:58,560 name of the directory, Nitecore Tube torch, and it comes back with the same file names, achieved by a different 173 00:17:58,960 --> 00:18:05,200 method. I think I would use the former in pretty much all cases but if you don't have a type that you 174 00:18:05,200 --> 00:18:16,240 can use to do the search then the -g thing is an alternative. So what about finding the names of files, 175 00:18:16,240 --> 00:18:25,040 listing the names of files that contain a match with some string, and the number of matches per file? 176 00:18:25,120 --> 00:18:32,720 So this one is the ack command followed by --markdown to look into Markdown files again. 177 00:18:33,120 --> 00:18:40,880 This time I'm using -lci, so I've concatenated three of the options together, which you can do 178 00:18:40,880 --> 00:18:48,240 with single character options. And by the way these options usually, I think in all cases, 179 00:18:48,320 --> 00:18:56,960 no, not all cases, but in many cases, have a single character version and a double hyphen followed by a 180 00:18:56,960 --> 00:19:03,840 long version, but I'm using the short versions here for demonstration purposes. So -lci means 181 00:19:03,840 --> 00:19:12,400 use the options -l, -c and -i. Now my match string, my pattern, is 182 00:19:13,280 --> 00:19:24,000 open quote, backslash b, e a r, backslash b. Now that's using one of the Perl regular expression capabilities, which is 183 00:19:24,560 --> 00:19:33,680 to denote a boundary, in this case a word boundary. So the backslash b can be an opening 184 00:19:34,640 --> 00:19:43,360 boundary or a closing one.
So it's saying look for the word e a r, which is a 185 00:19:43,360 --> 00:19:49,680 standalone word. Now there are other ways of doing that. You can do something like that in grep, 186 00:19:50,320 --> 00:19:58,480 but the regular expression syntax varies in the way that this word boundary thing is done and 187 00:19:59,200 --> 00:20:07,280 gets a little bit messy. But the point is that it's separating out the sequence e a r from, 188 00:20:07,840 --> 00:20:15,440 for example, the word pearl, which has an e a r, but you don't want pearl to be returned because it's not 189 00:20:15,440 --> 00:20:21,520 a word, I mean it's a word, but I mean it hasn't got the word ear in it. So example three is 190 00:20:21,680 --> 00:20:30,720 just looking for a word in a simpler way. Similar to example two, where we used the backslash b boundaries, 191 00:20:30,720 --> 00:20:39,200 you can achieve this alternatively by making the pattern simpler, just the word e a r, and 192 00:20:39,200 --> 00:20:48,080 preceding it with -w, which as I mentioned before says treat this as a word or words, I think, but 193 00:20:48,080 --> 00:20:54,320 I might have to experiment with that. But in both of these cases what you get back is a list of 194 00:20:54,320 --> 00:21:02,720 file names and each file name is followed by a colon and a number which tells you how many matches 195 00:21:02,720 --> 00:21:10,560 for that word exist in the file.
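Examples two and three, the explicit word boundaries versus the -w option, can be checked against each other with a sketch like this one. It assumes ack 3.x is installed; the two files and their contents are made up so that one contains the standalone word and the other only contains it inside "pearl".

```shell
# ear.md has the standalone word twice (once capitalised); pearl.md only
# contains the letters e-a-r inside a longer word.
demo=$(mktemp -d)
printf 'My inner ear\nAn Ear for music\n' > "$demo/ear.md"
printf 'A pearl necklace\n'               > "$demo/pearl.md"

if command -v ack >/dev/null 2>&1; then
    # Example two: \b marks a word boundary on each side of the pattern.
    boundary=$(ack --markdown -lci '\bear\b' "$demo")
    # Example three: -w asks ack to match whole words only.
    whole=$(ack --markdown -lci -w 'ear' "$demo")
else
    boundary="ack not installed"
    whole="ack not installed"
fi
printf 'with \\b: %s\nwith -w: %s\n' "$boundary" "$whole"
```

Both commands should report only ear.md, each followed by a colon and the number of matching lines, and pearl.md should not appear at all.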
They're the same in both cases of course, in examples two and three. 196 00:21:10,560 --> 00:21:17,360 So I thought I'd better stop with the examples here because it's getting a bit too long 197 00:21:17,360 --> 00:21:25,120 otherwise. So that's it then. There are some references to the Perl reference manuals and tutorials 198 00:21:25,120 --> 00:21:32,000 and stuff, and the site where ack can be found and details about it. So a little bit of that, 199 00:21:32,000 --> 00:21:39,280 that's the end, and I hope you found that useful. Okay then, bye. 200 00:22:02,000 --> 00:22:08,160 and rsync.net. Unless otherwise stated, today's show is released under a Creative 201 00:22:08,160 --> 00:22:16,160 Commons Attribution 4.0 International license.